Arabic Gloss WSD Using BERT
Authors
Abstract
Word Sense Disambiguation (WSD) aims to predict the correct sense of a word given its context. This problem is of extreme importance in Arabic, as written words can be highly ambiguous; 43% of diacritized words have multiple interpretations, and the percentage increases to 72% for non-diacritized words. Nevertheless, most Arabic text does not have diacritical marks. Gloss-based WSD methods measure the semantic similarity or overlap between the context of a target word that needs to be disambiguated and the dictionary definition of that word (the gloss of the word). Arabic gloss WSD suffers from a lack of context-gloss datasets. In this paper, we present an Arabic gloss-based WSD technique. We utilize the celebrated Bidirectional Encoder Representation from Transformers (BERT) to build two models that can efficiently perform Arabic WSD. These models can be trained with few training samples since they utilize BERT models that were pretrained on a large Arabic corpus. Our experimental results show that our models outperform recent gloss-based WSDs when we test them against the same test data used to evaluate our model. Additionally, our model achieves an F1-score of 89% compared to the best-reported F1-score of 85% for knowledge-based Arabic WSD. Another contribution of this paper is introducing a context-gloss benchmark that may help overcome the lack of a standardized benchmark for Arabic gloss-based WSD.
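To make the gloss-overlap idea concrete, here is a minimal sketch of a simplified Lesk-style scorer: it counts the content words shared between a context sentence and each candidate gloss, then picks the sense with the largest overlap. This is not the BERT-based method the paper presents; it only illustrates the overlap baseline that gloss-based WSD builds on. The function names, the tiny stopword list, the English example, and the sense labels are all assumptions made for illustration.

```python
import string

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "at", "of", "on", "or", "the", "that", "to"}


def content_words(text):
    """Lowercase the text, strip surrounding punctuation, and drop stopwords."""
    tokens = (tok.strip(string.punctuation).lower() for tok in text.split())
    return {tok for tok in tokens if tok and tok not in STOPWORDS}


def overlap_score(context, gloss):
    """Number of distinct content words shared by the context and the gloss."""
    return len(content_words(context) & content_words(gloss))


def disambiguate(context, glosses):
    """Return the sense label whose gloss overlaps most with the context.

    `glosses` maps a hypothetical sense label to its dictionary definition.
    Ties are resolved in favour of the first sense in the mapping.
    """
    return max(glosses, key=lambda sense: overlap_score(context, glosses[sense]))


# Hypothetical two-sense inventory for the English word "bank".
GLOSSES = {
    "bank_financial": "an institution that accepts deposits and lends money",
    "bank_river": "sloping land beside a river or lake",
}

if __name__ == "__main__":
    print(disambiguate("She sat on the bank of the river and watched the water.", GLOSSES))
    print(disambiguate("He deposits money at the bank every month.", GLOSSES))
```

Exact word overlap fails whenever the context and the gloss express the same idea with different words, which is one motivation for replacing it with a learned similarity from a pretrained model such as BERT.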
Similar resources
Assessing Gloss of Tooth using Digital Imaging
The aim of this study was to assess the gloss of teeth by digital photography. A gonio-imaging system (gonio being Greek for angle) was developed to measure the gloss of human teeth in a laboratory setting. Polarised and non-polarised images were acquired around the specular angle. The gloss component was extracted and normalised to a theoretical standard; a BRDF curve was built to describe the gl...
Sussx: WSD using Automatically Acquired Predominant Senses
We introduced a method for discovering the predominant sense of words automatically using raw (unlabelled) text in (McCarthy et al., 2004) and participated with this system in SENSEVAL3. Since then, we worked on further developing ideas to improve upon the base method. In the current paper we target two areas where we believe there is potential for improvement. In the first one we address the f...
Using Semantic Classification Trees for WSD
This paper describes the evaluation of a WSD method within SENSEVAL. This method is based on Semantic Classification Trees (SCTs) and short context dependencies between nouns and verbs. The training procedure creates a binary tree for each word to be disambiguated. SCTs are easy to implement and yield some promising results. The integration of linguistic knowledge could lead to substantial impr...
Neighbors Help: Bilingual Unsupervised WSD Using Context
Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and within WSD, verb disambiguation has proved to be extremely difficult because of a high degree of polysemy, overly fine-grained senses, the absence of a deep verb hierarchy, and low inter-annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention but has performed poorly, especially on verbs. Recen...
OE: WSD Using Optimal Ensembling (OE) Method
Optimal ensembling (OE) is a word sense disambiguation (WSD) method using word-specific training factors (average positive vs negative training per sense, posex and negex) to predict best system (classifier algorithm / applicable feature set) for given target word. Our official entry (OE1) in Senseval-4 Task 17 (coarse-grained English lexical sample task) contained many design flaws and thus fa...
Journal
Journal title: Applied Sciences
Year: 2021
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app11062567